The Relationship between Extremum Statistics and Universal Fluctuations 
The normalized probability density function (PDF) of global measures of a large class of highly
correlated systems has previously been demonstrated to fall on a single non Gaussian "universal" 
curve. We derive the functional form of the "global" PDF in terms of the "source" PDF of the 
individual events in the system. A single parameter distinguishes the global PDF and is related 
to the exponent of the source PDF. When normalized, the global PDF is shown to be insensitive 
to this parameter and importantly we obtain the previously demonstrated "universality" from an 
uncorrelated Gaussian source PDF. The second and third moments of the global PDF are more 
sensitive, providing a powerful tool to probe the degree of complexity of physical systems. 
The study of systems exhibiting non Gaussian statistics is of considerable current interest.
These statistics are observed to arise in finite sized many body systems exhibiting correlation
over a broad range of scales. The apparently ubiquitous nature of this behavior has led to
interest in self organized criticality [1],[2] as a paradigm; other highly correlated systems
include fluid turbulence. Two recent results have highlighted the connection between extremum
statistics and highly correlated systems. The probability density function (PDF) of fluctuations
in the power needed to drive an enclosed rotating turbulent fluid at constant angular frequency
has been measured over 2 decades in Reynolds number. Intriguingly, when the PDFs P(E) of this
series of experiments were normalized to the first two moments they were found to fall on a
single non Gaussian "universal" curve. This same universal curve was later identified in a study
of the two dimensional X-Y model, a numerical model for magnetization near the critical point.
To obtain the universal curve, the PDF of a global measure, namely the magnetization summed over
the entire system, is again normalized to the first two moments. It was suggested that these two
disparate systems share the same statistics as they are both critical. The functional form of
the "universal" curve was found for the X-Y model and was shown to be of the form

$$P(E) = K\left(e^{y - e^{y}}\right)^{a} \quad \text{with} \quad y = b(E - s) \tag{1}$$

with a = π/2 and K, b, s obtained by normalizing the curve to the first two moments. Crucially,
it was then demonstrated that this curve was also in reasonable agreement with appropriately
chosen normalized global measures for a range of numerical models of highly correlated systems.
It was suggested that this behavior is related to the extremum statistics that arise from a
process that is highly correlated.

In this Letter we give a comprehensive analysis of extremum statistics in the context of finite
sized systems. Our aim is to determine the relationship between the underlying "source" PDF of a
given process and the PDF of some global measure. Given that events occur over a range of sizes,
and that each event represents some quantity, magnetization or energy dissipation say, we obtain
a relationship between the "source" PDF of the event size and the PDF of a global measure, the
total magnetization or energy dissipation over the system. We find, as suggested in [6], that
the global PDF, when normalized to the first and second moments, is essentially of the form of
equation (1). Crucially, however, we find that the "universal" curve for the global PDF, that
is, equation (1) with a = π/2, is not uniquely a property of a source PDF of a correlated
process. Instead, in a finite sized system, distributions of this form with a in the range
[1, 2] arise from uncorrelated samples from a source PDF ranging from exponential through
Gaussian to power law, the value of a being determined by the source PDF. When normalized to the
first and second moments these curves are only distinguishable asymptotically. Hence in reality
the "universal" curve describes, to within typical experimental or numerical statistical
uncertainties, distribution (1) with a in the range [1, 2].

In many physical situations it is relatively straightfor- 
ward to measure the PDF of some global quantity such as 
power dissipation in the driven turbulent fluid. In order 
to understand the underlying process we require details 
of the distribution of the source PDF. In particular, if this 
process is highly correlated, the source PDF of individ- 
ual events is anticipated to be power law and we wish to 
i) distinguish this unambiguously from an uncorrelated 
Gaussian process and ii) measure the exponent. A direct 
measurement of the source PDF requires the challeng- 
ing measurement of event sizes over many decades, but 
if we can relate the power law exponent to the form of 
the global PDF there is the possibility to remote sense 
this exponent. Normalizing the global PDF to the first 
and second moments is an insensitive method to find a; 
we show that for finite sized systems the higher order 
moments provide a more feasible method. 

The first step is to obtain the PDF of some global quantity from the source PDF that describes
individual events. Consider a finite sized system of dimension D which at any instant in time
has patches of activity on various length scales up to the system size $L_b$. The patches are
drawn from the (time independent) source probability N(L) of a patch of length L. These patches
can represent sites involved in an avalanche in a sandpile, vortices in a turbulent fluid,
ignited trees in a forest fire, or sites with nonzero magnetization in the X-Y model. Associated
with the active sites is some quantity of interest, Q say, for example energy or magnetization,
which we take to be given by $Q = L^D$. There will be some maximum $Q_b$ corresponding to the
(extremely rare) configuration with the highest possible value of Q, that is, highest energy or
magnetization, that can be realized by the system. The total value of Q over the system at any
instant arises from the distribution of the patches at that instant, $N_j(L)$:

$$Q_j = \int_0^{Q_b} Q\, N_j(Q)\, dQ = \int_0^{L_b} L^D N_j(L)\, dL \tag{2}$$

where $N_j$ is the distribution of an (unknown) realizable ensemble of patches (continuous limit
N(L)) that fits within the finite sized system, and the integral is over the system. Since N is
normalized, N(L)dL = N(Q)dQ. We now wish to evaluate the PDF of the $Q_j$. This arises from the
many ensembles of the system; for the j-th ensemble the total value of Q can alternatively be
written as a sum over the $M_j$ (unknown) individual patches $\{L_i\}_j$, $1 \le i \le M_j$. If
$N_j(L)$ is monotonically decreasing (from maximum $N_{0j}$ to zero) we can generate each of the
$\{L_i\}_j$ by choosing $M_j$ random numbers $N_i$ in the range $[0, N_{0j}]$, with uniform
probability distribution $P(N_i)$. If we then insist that $P(N_i) = P(L_i)$, for each
realization the random $N_i$ will each lie in one of the $M_j$ uniform intervals $\delta N_i$,
giving $L_i$ patches which lie in corresponding (nonuniform) intervals $\delta L_i$ obtainable
in principle by inverting $N_j(L_i)$. We can then write the sum of the patches in the j-th
ensemble:



$$Q_j = \sum_{i=1}^{M_j} Q_i = \sum_{i=1}^{M_j} L_i^D \tag{3}$$



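If the gradient of N(L) is near monotonic, $\delta N_i/\delta L_i \simeq -(dN/dL)$ so that

$$Q_j \simeq \sum_{i=1}^{M_j} \frac{P(N_i)\,\delta N_i\, L_i^D}{-dN/dL} \sim \int_0^{N_{0j}} \frac{L^D\, dN}{-dN/dL} \tag{4}$$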



For a source PDF N(L) that is exponential, Gaussian or inverse power law for large L,
$dN/dL \ll L^D$ for small N, that is, large L (large Q). Hence the dominant contribution to
$Q_j$ is that of the largest patch of activity. Thus the statistics of the PDF of Q, P(Q), will
be extremum statistics, $P(Q) \equiv P_m(Q)$, the normalized PDF of the maximum drawn from the
ensembles. Given that the maximum for the j-th ensemble is given by
$Q^*_j = \max\{Q_1, \ldots, Q_{M_j}\}$, where $Q_{M_j} < Q_b$, that is, $M_j$ finite, the PDF
for $Q^*$ is given by



$$P_m(Q^*) = M\, N(Q^*) \left(1 - N_>(Q^*)\right)^{M-1} \tag{5}$$



where M is the average of Mj over the ensembles and 

$$N_>(Q^*) = \int_{Q^*}^{Q_b} N(Q)\, dQ \simeq \int_{Q^*}^{\infty} N(Q)\, dQ \tag{6}$$
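As an illustration of (5) and (6), the following Python sketch evaluates the PDF of the maximum for one specific, assumed source PDF (a unit-rate exponential, chosen here only for the example) and compares its mean with a direct simulation of the largest of M draws. The helper names `max_pdf` and `integrate`, and the values of `M` and `T`, are hypothetical and were picked to keep the run short.

```python
import math
import random

def max_pdf(q, m, pdf, tail):
    """Eq. (5): PDF of the largest of m independent draws from the source PDF."""
    return m * pdf(q) * (1.0 - tail(q)) ** (m - 1)

# Assumed source for this sketch: N(Q) = exp(-Q), so that N_>(Q) = exp(-Q).
def source_pdf(q):
    return math.exp(-q)

def source_tail(q):
    return math.exp(-q)

def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

M = 500    # patches per ensemble (illustrative value)
T = 1000   # number of ensembles in the simulation (illustrative value)

norm = integrate(lambda q: max_pdf(q, M, source_pdf, source_tail), 0.0, 40.0)
mean_formula = integrate(lambda q: q * max_pdf(q, M, source_pdf, source_tail), 0.0, 40.0)

# Direct simulation: record the largest of M exponential draws in each of T ensembles.
rng = random.Random(0)
mean_sampled = sum(max(rng.expovariate(1.0) for _ in range(M)) for _ in range(T)) / T

print(f"norm = {norm:.6f}, mean from (5) = {mean_formula:.4f}, sampled mean = {mean_sampled:.4f}")
```

The integral of (5) should come out close to one, and the two estimates of the mean should agree to within the sampling error of the simulation.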

We now obtain $P_m$ for large finite M, Q. For a general PDF N(Q),
$(1 - N_>)^M = \exp(-M g(Q^*))$ where

$$g(Q^*) = -\ln\left(1 - N_>(Q^*)\right) \simeq N_> + \frac{N_>^2}{2} \tag{7}$$

We now choose a characteristic value of $Q^*$, namely $\tilde{Q}^*$, such that for any of the j
ensembles

$$M g(\tilde{Q}^*) = M N_>(\tilde{Q}^*) + M \frac{N_>^2(\tilde{Q}^*)}{2} = q \tag{8}$$



Using this definition and the form (7) for $g(Q^*)$ we obtain
$g'(\tilde{Q}^*) = -N(\tilde{Q}^*)$ to lowest order in an expansion in q/M.

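We now consider specific source PDF N(Q). If N(Q) falls off sufficiently fast in Q, i.e. is
Gaussian or exponential, we can consider lowest order only, giving $g(Q^*) \simeq N_>$ and
$q = M N_>(\tilde{Q}^*)$. After some algebra, expanding in $Q^*$ near $\tilde{Q}^*$ gives

$$P(Q) \equiv P_m(Q) = P_m(Q^*) \sim \left(e^{u - e^{u}}\right)^{a} \tag{9}$$

with

$$a = -\frac{N'(\tilde{Q}^*)\, N_>(\tilde{Q}^*)}{N^2(\tilde{Q}^*)} \tag{10}$$

$$u = \ln\left(M N_>(\tilde{Q}^*)\right) - \frac{N(\tilde{Q}^*)}{N_>(\tilde{Q}^*)}\, \Delta Q^* \tag{11}$$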



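where $\Delta Q^* = Q^* - \tilde{Q}^*$. For N(Q) exponential (10) gives a = 1. For N(Q) Gaussian
we cannot obtain a exactly but as we shall see it is instructive to make an estimate. Given
$N(Q) = N_0 \exp(-\lambda Q^2)$ in the above we obtain $P_m = \tilde{P}_m \exp(R(\bar{u}))$ with

$$R = -\frac{\ln^2(q)}{4\lambda \tilde{Q}^{*2}} + \bar{u}\left(1 + \frac{2\ln(q)}{4\lambda \tilde{Q}^{*2}}\right) - \frac{\bar{u}^2}{4\lambda \tilde{Q}^{*2}} - e^{\bar{u}} \tag{12}$$

where we have used $u = -2\lambda \tilde{Q}^* \Delta Q^*$ and $\bar{u} = u + \ln(q)$. To lowest
order in $\Delta Q^*/\tilde{Q}^*$ (i.e. $\tilde{Q}^* \to \infty$) we have PDF (1) with a = 1,
but to next order, that is, neglecting the term in $\bar{u}^2$ only in (12), we have this PDF
with

$$a = \left(1 + \frac{2\ln(q)}{4\lambda \tilde{Q}^{*2}}\right) \neq 1 \tag{13}$$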



Power law source PDF N(Q) fall off sufficiently slowly with Q that we need to go to next order
in $\Delta Q^*/\tilde{Q}^*$. If we consider normalizable source PDF

$$N(Q) = \frac{N_0}{(1 + Q^2)^{k}} \tag{14}$$

then for large Q the above method yields that P(Q) is given by the form (9) but with

$$u = \ln(q) - \ln(a) - (2k - 1)\frac{\Delta Q^*}{\tilde{Q}^*}\left(1 - \frac{\Delta Q^*}{2\tilde{Q}^*}\right) \tag{15}$$

and a = 2k/(2k - 1). To lowest order, neglecting the $(\Delta Q^*/\tilde{Q}^*)^2$ term, (15)
reduces to (11). Hence a power law source PDF has maximal statistics $P_m(Q)$ which, when
evaluated to next order, have distribution (1) with a correction that is non negligible at the
asymptotes, consistent with the well known result due to Frechet [10].
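The values of a quoted for exponential and power law sources can be checked numerically. The sketch below assumes that the lowest-order exponent is the ratio a = -N'(Q) N_>(Q) / N(Q)^2 evaluated at the characteristic point; this is our reading of the expansion, not a formula confirmed independently. For the power law it uses a pure tail N(Q) proportional to Q^(-2k) for Q >= 1 as a stand-in for the large-Q limit of (14). All function names are hypothetical.

```python
import math

def exponent_a(pdf, dpdf, tail, q):
    """Assumed lowest-order estimate a = -N'(Q) N_>(Q) / N(Q)^2 at the point Q."""
    return -dpdf(q) * tail(q) / pdf(q) ** 2

def a_exponential(lam, q):
    """Exponential source N(Q) = lam * exp(-lam * Q)."""
    return exponent_a(
        lambda x: lam * math.exp(-lam * x),        # N(Q)
        lambda x: -lam ** 2 * math.exp(-lam * x),  # N'(Q)
        lambda x: math.exp(-lam * x),              # N_>(Q)
        q,
    )

def a_power_law(k, q):
    """Pure power-law tail N(Q) = (2k - 1) Q^(-2k) for Q >= 1 (stand-in for large Q)."""
    p = 2.0 * k
    return exponent_a(
        lambda x: (p - 1.0) * x ** (-p),             # N(Q)
        lambda x: -p * (p - 1.0) * x ** (-p - 1.0),  # N'(Q)
        lambda x: x ** (-(p - 1.0)),                 # N_>(Q)
        q,
    )

print("exponential:", a_exponential(0.7, 5.0))
for k in (3, 5, 100):
    print(f"power law k = {k}: a = {a_power_law(k, 2.0):.6f}, "
          f"2k/(2k-1) = {2 * k / (2 * k - 1):.6f}")
```

For these two families the ratio does not depend on the evaluation point, which is why a single value of Q suffices in the example.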

The above results should be contrasted with that of Fisher and Tippett. Central to that work and
later derivations is that a single ensemble of NM patches has the same statistics as the N
ensembles (of M patches) of which it is comprised. The fixed point of this expression for
arbitrarily large N and M is a = 1 for the exponential and Gaussian PDF, and the Frechet result
for power law PDF. Here, we consider a finite sized system so that although the number of
realizable ensembles of the system can be taken arbitrarily large, the number of patches M per
ensemble is always large but finite. Importantly, the rate of convergence with M depends on the
PDF N(L). For an exponential or power law PDF we are able to resum the above expansion exactly
to obtain a, and convergence will then just depend on terms O(1/M) and above. This procedure is
not possible for N(Q) Gaussian; instead we consider the characteristic $Q^*$, that is
$\tilde{Q}^*$, which for M arbitrarily large should be large also. Rearranging (8) to lowest
order for $N(Q) = N_0 \exp(-\lambda Q^2)$ yields $\sqrt{\lambda}\,\tilde{Q}^* \sim \sqrt{\ln(M)}$,
implying significantly slower convergence.
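The slow growth of the characteristic value with M can be seen by solving the lowest-order condition M N_>(Q) = q numerically. The sketch below assumes a normalized two-sided Gaussian source with λ = 1/2 (unit variance) and q = 1; these choices, and the names `gaussian_tail` and `characteristic_q`, are assumptions made for the illustration.

```python
import math

def gaussian_tail(q, lam):
    """N_>(q) for the normalized two-sided Gaussian N(Q) = sqrt(lam/pi) exp(-lam Q^2)."""
    return 0.5 * math.erfc(math.sqrt(lam) * q)

def characteristic_q(m, lam=0.5, q=1.0):
    """Solve M N_>(Q) = q for Q by bisection (lowest-order form of the condition)."""
    lo, hi = 0.0, 40.0 / math.sqrt(lam)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if m * gaussian_tail(mid, lam) > q:
            lo = mid   # tail still too heavy: the root lies at larger Q
        else:
            hi = mid
    return 0.5 * (lo + hi)

for exponent in (3, 5, 7, 9):
    m = 10 ** exponent
    q_char = characteristic_q(m)
    print(f"M = 1e{exponent}: sqrt(lam) Q = {math.sqrt(0.5) * q_char:.3f}, "
          f"sqrt(ln M) = {math.sqrt(math.log(m)):.3f}")
```

Under these assumptions, increasing M by many decades changes the scaled characteristic value only modestly, and for M = 10^5 it is close to 3.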

We now have the intriguing result that for a wide range of source PDF the PDF of a global
measure P(Q) is essentially a family of curves that are approximately Gumbel in form and are
asymmetric with a handedness that just depends on the sign of Q; we have assumed Q positive
whereas one could choose Q negative (with L positive) in which case $N(Q) \to N(|Q|)$. The
single parameter a that distinguishes the global PDF then just depends on the source PDF of the
individual events. For N(Q) exponential we recover the well known result a = 1. For a power law
source PDF a is determined by k as above. For a Gaussian source PDF $a \neq 1$.

To compare these curves we normalize $P(Q) \equiv P_m(Q^*)$. For Gaussian and exponential source
PDF we have

$$P(y) = K\left(e^{u - e^{u}}\right)^{a} \quad \text{with} \quad u = b(y - s) \tag{16}$$

This has moments

$$M_n = \int_{-\infty}^{\infty} y^n P(y)\, dy \tag{17}$$

which converge for all n; we insist that $M_0 = 1$, $M_1 = 0$ and $M_2 = 1$. The necessary
integrals can be expressed in terms of derivatives of the Gamma function Γ(a) [12] and we obtain
after some algebra

$$b^2 = \Psi'(a), \qquad K = \frac{b}{\Gamma(a)}\, e^{a \ln(a)}, \qquad s = -\frac{1}{b}\left(\Psi(a) - \ln(a)\right) \tag{18}$$

where Ψ(a) and its derivative with respect to a, Ψ'(a), have their usual meaning. The ambiguity
in the sign of b (and hence s) corresponds to the two solutions for P(Q) for positive and
negative Q.
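The normalization can be checked numerically. The sketch below assumes the closed forms b = sqrt(Ψ'(a)), K = b a^a / Γ(a) and s = -(Ψ(a) - ln a)/b (taking the positive root for b), and then integrates the curve to confirm unit area, zero mean and unit variance. The digamma and trigamma routines are hypothetical standard-library-only helpers (recurrence plus an asymptotic series), written for this example.

```python
import math

def digamma(x):
    """Psi(x) for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    acc = 0.0
    while x < 8.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))

def trigamma(x):
    """Psi'(x) for x > 0: same strategy as digamma."""
    acc = 0.0
    while x < 8.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = 1.0 / 6 - inv2 * (1.0 / 30 - inv2 * (1.0 / 42 - inv2 / 30))
    return acc + inv + 0.5 * inv2 + inv * inv2 * series

def normalization(a):
    """Constants b, K, s assumed to give unit area, zero mean and unit variance."""
    b = math.sqrt(trigamma(a))
    K = b * a ** a / math.gamma(a)
    s = -(digamma(a) - math.log(a)) / b
    return b, K, s

def pdf(y, a, consts):
    """The curve K (exp(u - exp(u)))^a with u = b (y - s)."""
    b, K, s = consts
    u = b * (y - s)
    return K * math.exp(a * (u - math.exp(u)))

def moments(a, lo=-30.0, hi=12.0, n=42000):
    """Trapezoidal estimates of M_0, M_1, M_2 for the normalized curve."""
    consts = normalization(a)
    h = (hi - lo) / n
    m = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        p = weight * pdf(y, a, consts)
        m[0] += p
        m[1] += y * p
        m[2] += y * y * p
    return [h * v for v in m]

for a in (1.0, math.pi / 2.0, 2.0):
    b, K, s = normalization(a)
    m0, m1, m2 = moments(a)
    print(f"a = {a:.4f}: b = {b:.4f}, K = {K:.4f}, s = {s:.4f}, "
          f"M0 = {m0:.6f}, M1 = {m1:.6f}, M2 = {m2:.6f}")
```

Choosing the negative root for b mirrors the curve about y = 0, which corresponds to the second solution mentioned in the text.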

For power law source PDF we use the Frechet distribution [10], which we first write as (9) with

$$u = \alpha + \beta \ln\left(1 + \frac{\Delta Q^*}{\tilde{Q}^*}\right) \tag{19}$$

which reduces to the form of (15) for $\Delta Q^*/\tilde{Q}^* \ll 1$. From (14), (15) we
identify β = -(2k - 1). Again we insist that $M_0 = 1$, $M_1 = 0$ and $M_2 = 1$ and obtain

$$\alpha = -\beta \ln\left[\frac{a^{1/\beta}}{\Gamma(1 + 1/\beta)}\right], \qquad K = \pm \beta\, a^{a} \left[\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)\right]^{1/2}, \qquad G^2 = \frac{\Gamma^2(1 + 1/\beta)}{\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)} \tag{20}$$

For β → ∞ with β/G finite these equations reduce to (18) with b = -β/G.

We can now plot the "universal" curves, that is, normalized to the first two moments.
Experimental measurements of a global PDF P(E) normalized to $M_0$ would be plotted $M_2 P$
versus $(E - M_1)/M_2$. For the Frechet it is straightforward to show that the moments of order
n (17) exist for 2k > n + 1 and therefore these curves exist for power law of index
∞ > 2k > 3, i.e. 1 < a < 3/2. This is significant since processes exhibiting long range
correlations typically have k lower than this. Inset in Figure 1 we plot the normalized Frechet
PDF for k = 3, 5, 100 and the PDF (1) with a = 1. In the limit k → ∞, a → 1 and the normalized
Frechet PDF tends to the a = 1 limit of (1); hence for k = 100 these are indistinguishable, and
differences between the PDF appear on such a plot around the mean for k < 3 approximately. In
the main plot we show normalized distributions of the form (1) for a = 1, π/2 and 2. It is
immediately apparent that the curves are difficult to distinguish for several decades in P(y),
and either numerical or real experiments would require good statistics over a dynamic range of
about 4 decades, which is not readily achievable.

Since the second moment $M_2$ does not exist for k < 3/2 we cannot consider curves of a > 3/2
generated by power law source PDF; however such values (in particular a = π/2) were identified
for the "universal" curves in turbulence experiments and a variety of models of correlated
systems. We now demonstrate that these are straightforward to produce. On Figure 1 we have
overplotted (*) the global PDF generated by a source PDF that is uncorrelated Gaussian,
calculated numerically. We randomly select M uncorrelated variables $Q_j$, j = 1, .., M, and to
specify the handedness of the extremum distribution, the $Q_j$ are defined negative and N(|Q|)
is normally distributed. This would physically correspond to a system where the global quantity
Q is negative, i.e. power consumption in a turbulent fluid, as opposed to power generation. To
construct the global PDF we generate T ensembles, that is, select T samples of the largest
negative number $Q^*_i = \min\{Q_1, \ldots, Q_M\}$, i = 1, .., T. For the data shown in the
figure M = 10^5 and T = 10^6; this gives
$\sqrt{\lambda}\,\tilde{Q}^* \sim \sqrt{\ln(M)} \sim 3$ so that for the Gaussian we are far from
the a = 1 limit.

The numerically calculated PDF lies close to a = π/2. Such a value of a on these "universal"
curves is therefore not strong evidence of a correlated process, as has been suggested.
Generally, plotting data in this way is an insensitive method for determining a and thus
distinguishing the statistics of the underlying physical process.
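A reduced version of this numerical experiment can be sketched in a few lines of Python. Instead of drawing M Gaussian variables and keeping the smallest, the sketch samples the minimum directly through its cumulative distribution, which is an equivalent shortcut rather than the procedure described in the text. The value of T, the seed and the function name `sample_minimum` are assumptions chosen to keep the run short.

```python
import math
import random
from statistics import NormalDist

def sample_minimum(m, rng, dist=NormalDist()):
    """Draw the minimum of m independent standard normal variables in one step."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); this is vanishingly rare
        u = rng.random()
    # P(min > x) = (1 - F(x))^m, so F(x_min) = 1 - u^(1/m) for uniform u.
    p = -math.expm1(math.log(u) / m)
    return dist.inv_cdf(p)

M = 10 ** 5   # variables per ensemble, as in the text
T = 20000     # ensembles (far fewer than the 10^6 of the text, for speed)

rng = random.Random(1)
samples = [sample_minimum(M, rng) for _ in range(T)]

mean = sum(samples) / T
var = sum((x - mean) ** 2 for x in samples) / T
skew = sum((x - mean) ** 3 for x in samples) / T / var ** 1.5

# Normalize to zero mean and unit variance, as for the "universal" curves.
normalized = [(x - mean) / math.sqrt(var) for x in samples]

print(f"mean = {mean:.4f}, variance = {var:.4f}, skewness = {skew:.4f}")
```

A histogram of `normalized` can then be compared with curves of the form (1), and the sample skewness with the third moment of those curves for different values of a.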

The question of interest is whether we can determine the form of the source PDF from the global
PDF from data with a reasonable dynamic range. We consider two possibilities here. First, a
uniformly sampled process will have the most statistically significant values on the universal
curve near the peak. For both the PDF the peak is at u = 0 and is at $P(u = 0) = K e^{-a}$ with
K given by (18) and (20) respectively. The latter applies to k > 3/2; for smaller k > 1 we may
use $M_0 = 1$, $M_1 = 0$ plus a condition on P(u = 0) to obtain a. A more sensitive indicator
may be the third moment of P which after some algebra can be written as



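$$M_3 = \frac{\Psi''(a)}{\left(\Psi'(a)\right)^{3/2}} \tag{21}$$

for a Gaussian or exponential source PDF, i.e. with (16), and

$$M_3 = \frac{\Gamma(1 + 3/\beta) - 3\Gamma(1 + 2/\beta)\Gamma(1 + 1/\beta) + 2\Gamma^3(1 + 1/\beta)}{\left[\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)\right]^{3/2}} \tag{22}$$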



for a power law source PDF, i.e. with (19), the latter converging for 2k > 4. Again these refer
to one of the two possible solutions for P(Q), the other solution corresponding to y → -y,
$M_3 \to -M_3$. We can compare these two methods by noting that for PDF of the form (1) with
a = 1, 2 the corresponding values of $P_m$ differ by ~7.9% whereas $M_3$ differs by ~32%. For
Frechet PDF, the variation in P(u = 0) is most significant for smaller k; for example with
k = 3, 4, P(u = 0) differs by ~15% whereas $M_3$ differs by ~30%.
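The comparison for a = 1 and a = 2 can be reproduced with a short calculation. The sketch below assumes the closed forms P(u = 0) = K e^{-a} with K = b a^a / Γ(a) and b = sqrt(Ψ'(a)), and M_3 = Ψ''(a) / Ψ'(a)^{3/2}; these are our reading of the normalization and third-moment expressions. The polygamma helpers `trigamma` and `tetragamma` are hypothetical standard-library-only routines written for this example.

```python
import math

def trigamma(x):
    """Psi'(x) for x > 0 via upward recurrence plus an asymptotic series."""
    acc = 0.0
    while x < 8.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = 1.0 / 6 - inv2 * (1.0 / 30 - inv2 * (1.0 / 42 - inv2 / 30))
    return acc + inv + 0.5 * inv2 + inv * inv2 * series

def tetragamma(x):
    """Psi''(x) for x > 0 via upward recurrence plus an asymptotic series."""
    acc = 0.0
    while x < 8.0:
        acc -= 2.0 / x ** 3
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = 0.5 - inv2 * (1.0 / 6 - inv2 * (1.0 / 6 - 0.3 * inv2))
    return acc - inv2 - inv2 / x - inv2 * inv2 * series

def peak_height(a):
    """Assumed peak value P(u = 0) = K exp(-a) of the normalized curve."""
    b = math.sqrt(trigamma(a))
    return b * a ** a * math.exp(-a) / math.gamma(a)

def third_moment(a):
    """Assumed third moment Psi''(a) / Psi'(a)^(3/2) of the normalized curve."""
    return tetragamma(a) / trigamma(a) ** 1.5

peak_change = (peak_height(1.0) - peak_height(2.0)) / peak_height(1.0)
skew_change = (third_moment(1.0) - third_moment(2.0)) / third_moment(1.0)

# Relative changes between a = 1 and a = 2: about 0.079 for the peak, about 0.315 for M_3.
print(f"peak: {peak_height(1.0):.4f} -> {peak_height(2.0):.4f}, change {peak_change:.3f}")
print(f"M_3 : {third_moment(1.0):.4f} -> {third_moment(2.0):.4f}, change {skew_change:.3f}")
```

The third moment changes roughly four times as much as the peak height over this range of a, which is the sense in which it is the more sensitive indicator.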

In conclusion, we have shown that the statistics of fluctuations in a global measure of a finite
sized system, such as total energy dissipation in a turbulent fluid, or total magnetization in a
ferromagnet, are generally given by extremum statistics. The PDF of the global measure is then
one of a family of curves whose moments have been determined in terms of a single parameter a
which in turn quantifies the PDF of the underlying "source" process, such as the PDF of
individual energy release events or patches of magnetization. When normalized to the first and
second moments these curves are insensitive to a and fall close to the single "universal" curve
previously identified as a property of a large class of highly correlated systems, over the
range achieved by previous real or numerical experiments. In particular, we find that the global
PDF of an uncorrelated Gaussian process is 'Gumbel' (1) distributed with a ~ π/2, providing a
straightforward explanation for the previously demonstrated "universality". Finally we suggest
that the peak, or the third moment, of the global PDF is a more sensitive indicator of the
source PDF. This is a powerful tool to probe the exponents of physical systems where the source
PDF is difficult to measure but provides a signature of the degree of complexity of the system.
FIG. 1. The normalized PDF (1) with a = 2, π/2, 1 (the right hand asymptotes of the curves
intersect the ordinate in that order from left to right). Overplotted is the numerically
evaluated global PDF of an uncorrelated Gaussian process. Inset are Frechet PDF normalized to
the first two moments for source PDF exponents 2k, k = 3, 5, 100, on the same scale.